71 research outputs found

    DRackSim: Simulator for Rack-scale Memory Disaggregation

    Full text link
    Memory disaggregation has emerged as an alternative to traditional server architecture in data centers. This paper introduces DRackSim, a simulation infrastructure to model rack-scale hardware disaggregated memory. DRackSim models multiple compute nodes, memory pools, and a rack-scale interconnect similar to GenZ. An application-level simulation approach simulates an x86 out-of-order multi-core processor with a multi-level cache hierarchy at compute nodes. A queue-based simulation is used to model a remote memory controller and rack-level interconnect, which allows both cache-based and page-based access to remote memory. DRackSim models a central memory manager to manage address space at the memory pools. We integrate community-accepted DRAMSim2 to perform memory simulation at local and remote memory using multiple DRAMSim2 instances. An incremental approach is followed to validate the core and cache subsystem of DRackSim with that of Gem5. We measure the performance of various HPC workloads and show the performance impact for different nodes/pools configuration

    Embedding Security into Ferroelectric FET Array via In-Situ Memory Operation

    Full text link
    Non-volatile memories (NVMs) have the potential to reshape next-generation memory systems because of their promising properties of near-zero leakage power consumption, high density and non-volatility. However, NVMs also face critical security threats that exploit the non-volatile property. Compared to volatile memory, the capability of retaining data even after power down makes NVM more vulnerable. Existing solutions to address the security issues of NVMs are mainly based on Advanced Encryption Standard (AES), which incurs significant performance and power overhead. In this paper, we propose a lightweight memory encryption/decryption scheme by exploiting in-situ memory operations with negligible overhead. To validate the feasibility of the encryption/decryption scheme, device-level and array-level experiments are performed using ferroelectric field effect transistor (FeFET) as an example NVM without loss of generality. Besides, a comprehensive evaluation is performed on a 128x128 FeFET AND-type memory array in terms of area, latency, power and throughput. Compared with the AES-based scheme, our scheme shows around 22.6x/14.1x increase in encryption/decryption throughput with negligible power penalty. Furthermore, we evaluate the performance of our scheme over the AES-based scheme when deploying different neural network workloads. Our scheme yields significant latency reduction by 90% on average for encryption and decryption processes

    FPGA ARCHITECTURE FOR 2D DISCRETE FOURIER TRANSFORM BASED ON 2D DECOMPOSITION FOR LARGE-SIZED DATA

    Get PDF
    ABSTRACT Applications based on Discrete Fourier Transforms (DFT) are extensively used in various areas of signal and digital image processing. Of particular interest is the two-dimensional (2D) DFT which is more computation-and bandwidth-intensive than the one-dimensional (ID) DFT. Traditionally, a 2D DFT is computed using Row-Column (RC) decomposition, where ID DFTs are computed along the rows followed by ID DFTs along the columns. Both application specific and reconfigurable hardware have been used for high-performance implementations of 2D DFT. However, architectures based on RC decomposition are not efficient for large input size data due to memory bandwidth constraints. In this paper, we propose an efficient architecture to implement the 2D DFT for largesized input data based on a novel 2D decomposition algorithm. This architecture achieves very high throughput by exploiting the inherent parallelism due to the algorithm decomposition and by utilizing the row-wise burst access pattern of the external memory. A high throughput memory interface has been designed to enable maximum utilization of the memory bandwidth. In addition, an automatic system generator is provided for mapping this architecture onto a reconfigurable platform of Xilinx Virtex5 devices. For a 2K x 2K input size, the proposed architecture is 1.96x times faster than RC decomposition based implementation under the same memory constraints, and also outperforms other existing implementations

    Powering Disturb-Free Reconfigurable Computing and Tunable Analog Electronics with Dual-Port Ferroelectric FET

    Full text link
    Single-port ferroelectric FET (FeFET) that performs write and read operations on the same electrical gate prevents its wide application in tunable analog electronics and suffers from read disturb, especially to the high-threshold voltage (VTH) state as the retention energy barrier is reduced by the applied read bias. To address both issues, we propose to adopt a read disturb-free dual-port FeFET where write is performed on the gate featuring a ferroelectric layer and the read is done on a separate gate featuring a non-ferroelectric dielectric. Combining the unique structure and the separate read gate, read disturb is eliminated as the applied field is aligned with polarization in the high-VTH state and thus improving its stability, while it is screened by the channel inversion charge and exerts no negative impact on the low-VTH state stability. Comprehensive theoretical and experimental validation have been performed on fully-depleted silicon-on-insulator (FDSOI) FeFETs integrated on 22 nm platform, which intrinsically has dual ports with its buried oxide layer acting as the non-ferroelectric dielectric. Novel applications that can exploit the proposed dual-port FeFET are proposed and experimentally demonstrated for the first time, including FPGA that harnesses its read disturb-free feature and tunable analog electronics (e.g., frequency tunable ring oscillator in this work) leveraging the separated write and read paths.Comment: 32 page

    Editorial

    No full text

    Going Vertical: The Future of Electronics

    No full text

    Introduction to Special Issue on Neuromorphic Computing

    No full text
    • …
    corecore